Humans are a critical part of running systems at scale, yet we rarely pay much attention to them. Most of the time, energy and investment go into picking the right technologies, the right hardware, the right APIs. But what about the people actually building and scaling those systems? How can you reach your high availability goals without a team that is able to build reliable systems and respond when things go wrong? How do sleep and fatigue affect system uptime? System errors are tracked, but what about human error? Can it be measured and mitigated? How does work get prioritised?
This talk will consider the principles and philosophy of HumanOps, focusing on the human side of running infrastructure. It is crucial to consider human system and process design as part of any large-scale software, hardware and infrastructure project. This talk will explain how.